A parallel finite element surface fitting algorithm for data mining

Authors

  • Peter Christen
  • Irfan Altas
  • Markus Hegland
  • Stephen G. Roberts
  • Kevin Burrage
  • Roger B. Sidje
Abstract

A major task in data mining is to develop automatic techniques to process and to detect patterns in very large data sets. Multivariate regression techniques form the core of many data mining applications. A common assumption is that the multivariate data is well approximated by an additive model involving only first and second order interaction terms. In this case high-dimensional nonparametric regression is reduced to the determination of a coupled set of first and second order interaction terms, that is, the determination of a coupled set of curves and surfaces. Thin plate splines provide a very good method to determine an approximating surface. Obtaining standard thin plate splines requires the solution of a dense linear system of equations of order n, where n is the number of observations. For data mining applications the number of observations is often in the millions, so standard thin plate splines may not be practical. We have developed a finite element approximation of a spline that can handle data sets with millions of records. The resolution of the finite element method can be chosen independently of the number of observations. The observation data can be read from secondary storage once and does not need to be stored in memory. In this paper, we discuss the parallel implementation of this method in an MPI environment.
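The two properties the abstract emphasizes, a system size that does not depend on n and a single pass over the data, can be illustrated with a much simpler one-dimensional analogue. The sketch below is not the authors' method: it assumes piecewise-linear hat functions on a uniform grid and a first-derivative roughness penalty in place of the thin plate spline functional, and the names `assemble`, `stiffness`, `fit`, and `alpha` are hypothetical. Because each chunk's contribution is simply added to the running matrices, the same sums could be formed by separate processes and combined with a sum reduction in an MPI setting.

```python
import numpy as np

def assemble(x, y, nodes):
    """Accumulate the contributions of one chunk of observations.

    Returns an m-by-m matrix and a length-m vector, where m = len(nodes),
    regardless of how many observations the chunk holds.
    """
    m = len(nodes)
    h = nodes[1] - nodes[0]              # uniform grid spacing (assumption)
    A = np.zeros((m, m))
    b = np.zeros(m)
    for xi, yi in zip(x, y):
        # Element containing xi, clamped so xi == nodes[-1] uses the last element.
        k = min(int((xi - nodes[0]) / h), m - 2)
        t = (xi - nodes[k]) / h          # local coordinate within the element
        phi = np.array([1.0 - t, t])     # the two hat functions active at xi
        A[k:k + 2, k:k + 2] += np.outer(phi, phi)
        b[k:k + 2] += phi * yi
    return A, b

def stiffness(nodes):
    """Penalty matrix for the integral of the squared first derivative."""
    m = len(nodes)
    h = nodes[1] - nodes[0]
    L = np.zeros((m, m))
    for k in range(m - 1):
        L[k:k + 2, k:k + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
    return L

def fit(chunks, nodes, alpha):
    """Stream over (x, y) chunks once, then solve one m-by-m system."""
    m = len(nodes)
    A = np.zeros((m, m))
    b = np.zeros(m)
    n = 0
    for x, y in chunks:
        A_chunk, b_chunk = assemble(x, y, nodes)
        A += A_chunk                     # partial sums are additive across chunks
        b += b_chunk
        n += len(x)
    # Penalized least squares: (A/n + alpha * L) c = b/n, with alpha > 0.
    return np.linalg.solve(A / n + alpha * stiffness(nodes), b / n)
```

In this sketch the data are touched only inside `assemble`, so memory use is governed by the number of grid nodes rather than the number of records; a dense solve is used for brevity even though the assembled matrix is tridiagonal.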


Related resources

Scalable parallel algorithms for surface fitting and data mining

This paper presents scalable parallel algorithms for high dimensional surface fitting and predictive modelling which are used in data mining applications. These algorithms are based on techniques like finite elements, thin plate splines, wavelets and additive models. They all consist of two steps: First, data is read from secondary storage and a linear system is assembled. Secondly, the linear ...


Parallelization of a finite element surface fitting algorithm for data mining

A major task in data mining is to develop automatic techniques to process and to detect patterns in very large data sets. An important data mining technique is multivariate regression, and an essential subtask is the estimation of interaction surfaces, i.e. the estimation of functions of two variables. Thin plate splines provide a very good method to determine an approximating surface. Obtainin...


Speeding up the Stress Analysis of Hollow Circular FGM Cylinders by Parallel Finite Element Method

In this article, a parallel computer program based on the Finite Element Method is implemented to speed up the analysis of hollow circular cylinders made from Functionally Graded Materials (FGMs). FGMs are inhomogeneous materials whose composition varies gradually over the volume. In parallel processing, an algorithm is first divided into independent tasks, which may use individual or shared da...


Extended finite element simulation of crack propagation in cracked Brazilian disc

The cracked Brazilian disc (CBD) specimen is widely used to determine the mode-I/II and mixed-mode fracture toughness of a rock medium. In this study, the stress intensity factor (SIF) at the crack tip in this specimen is calculated for various geometrical crack conditions using the extended finite element method (X-FEM). This method is based upon the finite element method (FEM). In this m...


Comprehensive Parametric Study for Design Improvement of a Low-Speed AFPMSG for Small Scale Wind-Turbines

In this paper, a comprehensive parametric analysis of an axial-flux permanent magnet synchronous generator (AFPMSG) designed to operate in small-scale wind-power applications is presented, and the conditions for maximum efficiency, minimum weight and minimum cost are deduced. A Computer-Aided Design (CAD) procedure based on the results of the parametric study is then proposed. Matching between t...



Publication date: 1999